Shape-based modeling of the fundamental frequency contour for emotion detection in speech

نویسندگان

  • Juan Pablo Arias
  • Carlos Busso
  • Néstor Becerra Yoma
چکیده

This paper proposes the use of neutral reference models to detect local emotional prominence in the fundamental frequency. A novel approach based on functional data analysis (FDA) is presented, which aims to capture the intrinsic variability of F0 contours. The neutral models are represented by a basis of functions and the testing F0 contour is characterized by the projections onto that basis. For a given F0 contour, we estimate the functional principal component analysis (PCA) projections, which are used as features for emotion detection. The approach is evaluated with lexicon-dependent (i.e., one functional PCA basis per sentence) and lexicon-independent (i.e., a single functional PCA basis across sentences) models. The experimental results show that the proposed system can lead to accuracies as high as 75.8% in binary emotion classification, which is 6.2% higher than the accuracy achieved by a benchmark system trained with global F0 statistics. The approach can be implemented at sub-sentence level (e.g., 0.5 sec segments), facilitating the detection of localized emotional information conveyed within the sentence. The approach is validated with the SEMAINE database, which is a spontaneous corpus. The results indicate that the proposed scheme can be effectively employed in real applications to detect emotional speech.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A method of representing fundamental frequency contours of Japanese using statistical models of moraic transition

A statistical modeling of voice fundamental frequency contours was proposed for the purpose of developing effective ways to utilize prosodic features in speech recognition. In view of the fact that prosodic features should be treated in longer units, the proposed modeling represents the transition in moraic units. A fundamental frequency contour was rst segmented into moraic units and then each...

متن کامل

Affective Intonation-Modeling for Mandarin Based on PCA

The speech fundamental frequency (henceforth F0) contour plays an important role in expressing the affective information of an utterance. The most popular F0 modeling approaches mainly use the concept of separating the F0 contour into a global trend and local variation. For Mandarin, the global trend of the F0 contour is caused by the speaker’s mood and emotion. In this paper, the authors addre...

متن کامل

Improving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms

One of the important issues in speech emotion recognizing is selecting of appropriate feature sets in order to improve the detection rate and classification accuracy. In last studies researchers tried to select the appropriate features for classification by using the selecting and reducing the space of features methods, such as the Fisher and PCA. In this research, a hybrid evolutionary algorit...

متن کامل

Nearly perfect detection of continuous f_0 contour and frame classification for TTS synthesis

We present a new method for the estimation of a continuous fundamental frequency (F0) contour. The algorithm implements a global optimization and yields virtually error-free F0 contours for high quality speech signals. Such F0 contours are subsequently used to extract a continuous fundamental wave. Some local properties of this wave, together with a number of other speech features allow to clas...

متن کامل

Continuous Speech Recognition of Japanese Using Prosodic Word Boundaries Detected by Mora Transition Modeling of Fundamental Frequency Contours

An HMM-based method of detecting prosodic word boundaries was developed for Japanese continuous speech and was successfully integrated into a mora-basis continuous speech recognition system with two stages operating without and with prosodic information. The method is based on modeling the fundamental frequency (F0) contour of input speech as transitions of mora-unit F0 contours and operates af...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Computer Speech & Language

دوره 28  شماره 

صفحات  -

تاریخ انتشار 2014